Parallel and Hierarchical Decision Making for Sparse Coding in Speech Recognition

Authors

  • Dong Wang
  • Ravichander Vipperla
  • Nicholas W. D. Evans
Abstract

Sparse coding exhibits promising performance in speech processing, mainly due to the large number of bases that can be used to represent speech signals. However, the high demand for computational power presents a major obstacle in the case of large datasets, as does the difficulty in utilising information scattered sparsely in high dimensional features. This paper reports the use of an online dictionary learning technique, proposed recently by the machine learning community, to learn large scale bases efficiently, and proposes a new parallel and hierarchical architecture to make use of the sparse information in high dimensional features. The approach uses multilayer perceptrons (MLPs) to model sparse feature subspaces and make local decisions accordingly; the latter are integrated by additional MLPs in a hierarchical way for making global decisions. Experiments on the WSJ database show that the proposed approach not only solves the problem of prohibitive computation with large-dimensional sparse features, but also provides better performance in a frame-level phone prediction task.
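
The parallel and hierarchical decision scheme described in the abstract can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation. The contiguous split into subspaces, the two-level depth, the layer sizes, the tanh hidden units, and the names `init_mlp`, `mlp_forward`, `build_model`, and `hierarchical_forward` are all assumptions made for illustration, and the weights are random and untrained.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def init_mlp(rng, n_in, n_hidden, n_out):
    # One-hidden-layer MLP with small random weights (untrained, illustration only).
    return {
        "W1": rng.normal(0.0, 0.1, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def mlp_forward(params, x):
    # Returns class posteriors for each row (frame) of x.
    h = np.tanh(x @ params["W1"] + params["b1"])
    return softmax(h @ params["W2"] + params["b2"])

def build_model(rng, n_features, n_subspaces, n_hidden, n_classes):
    # Assumption: the sparse feature vector is cut into contiguous subspaces.
    sizes = [len(idx) for idx in np.array_split(np.arange(n_features), n_subspaces)]
    local_mlps = [init_mlp(rng, s, n_hidden, n_classes) for s in sizes]
    # The top MLP sees the concatenated local posteriors.
    top_mlp = init_mlp(rng, n_subspaces * n_classes, n_hidden, n_classes)
    return local_mlps, top_mlp

def hierarchical_forward(local_mlps, top_mlp, x):
    # Local decisions: each MLP models one feature subspace independently,
    # so this loop could run in parallel.
    blocks = np.array_split(x, len(local_mlps), axis=1)
    local_posteriors = [mlp_forward(m, b) for m, b in zip(local_mlps, blocks)]
    # Global decision: a further MLP integrates the local decisions.
    return mlp_forward(top_mlp, np.concatenate(local_posteriors, axis=1))

rng = np.random.default_rng(0)
n_frames, n_features, n_subspaces, n_classes = 5, 1000, 8, 40
local_mlps, top_mlp = build_model(rng, n_features, n_subspaces, 32, n_classes)

# Toy stand-in for sparse codes: roughly 2% of the entries are nonzero.
mask = rng.random((n_frames, n_features)) < 0.02
x = rng.normal(size=(n_frames, n_features)) * mask

posteriors = hierarchical_forward(local_mlps, top_mlp, x)
print(posteriors.shape)
```

Each local MLP only sees a small slice of the input, so its first weight matrix stays small even when the full sparse feature vector is very large. This is one plausible reading of how such an architecture keeps computation manageable.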

Similar resources

Face Recognition using an Affine Sparse Coding approach

Sparse coding is an unsupervised method that learns a set of over-complete bases to represent data such as images and video. Sparse coding has attracted increasing attention for image classification in recent years. However, when similar images come from different classes, as in face recognition applications, different images may be classified into the same class, and hen...

Traffic Scene Analysis using Hierarchical Sparse Topical Coding

Analyzing motion patterns in traffic videos can be exploited directly to generate high-level descriptions of the video contents. Such descriptions may further be employed in different traffic applications such as traffic phase detection and abnormal event detection. One of the most recent and successful unsupervised methods for complex traffic scene analysis is based on topic models. In this pa...

Voice-based Age and Gender Recognition using Training Generative Sparse Model

Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...

Hierarchical Recognition of Sparse Patterns in Large-scale Simultaneous Inference

We study how to accurately separate signals from noisy data and determine the patterns of the selected signals. Controlling the inflation of false positive errors is an important issue in large-scale simultaneous inference but has not been addressed in the pattern recognition literature. We develop a decision-theoretic framework and formulate the sparse pattern recognition problem as a simultane...

Continuous speech recognition with sparse coding

Sparse coding is an efficient way of coding information. In a sparse code most of the code elements are zero; very few are active. Sparse codes are intended to correspond to the spike trains with which biological neurons communicate. In this article, we show how sparse codes can be used to do continuous speech recognition. We use the TIDIGITS dataset to illustrate the process. First a waveform ...

Publication date: 2011